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ABSTRACT 


We  propose  a  class  of  algorithms  for  finding  an  optimal  quasistatic 
routing  in  a  communication  network.  The  algorithms  are  based  on  Gallager's 
method  [1],  Their  main  feature  is  that  they  utilize  second  derivatives 
of  the  objective  function  and  may  be  viewed  as  approximations  to  a  con¬ 
strained  version  of  Newton's  method.  The  use  of  second  derivatives  results 
in  improved  speed  of  convergence  and  automatic  stepsize  scaling  with 
respect  to  level  of  traffic  input.  These  advantages  are  of  crucial 
importance  for  the  practical  implementation  of  the  algorithm  using  dis¬ 
tributed  computation. 
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1.  Introduction 

We  consider  the  problem  of  optimal  routing  of  messages  in  a  communication 
network  so  as  to  minimize  average  delay  per  message.  We  primarily  have  in 
mind  a  situation  where  the  statistics  of  external  traffic  inputs  change 
slowly  with  time  as  described  in  the  paper  by  Gallager  [1].  While  algo¬ 
rithms  of  the  type  to  be  described  can  also  be  used  for  centralized  com¬ 
putation,  we  place  primary  emphasis  on  algorithms  that  are  well  suited 
for  distributed  computation 

Two  critical  requirements  for  the  success  of  a  distributed  routing 
algorithm  are  speed  of  convergence  and  relative  insensitivity  of  performance 

to  variations  in  the  statistics  of  external  traffic  inputs.  Unfortunately 
the  algorithm  of  [1]  is  not  entirely  satisfactory  in  these  respects. 

In  particular  it  is  impossible  to  select  in  this  algorithm  a  stepsize  that 
will  guarantee  convergence  and  good  rate  of  convergence  for  a  broad  range 
of  external  traffic  inputs.  The  work  described  in  this  paper  was  motivated 
primarily  by  this  consideration. 

A.  standard  approach  for  improving  the  rate  of  convergence  and 
facilitating  stepsize  selection  in  optimization  algorithms  is  to  scale 
the  descent  direction  using  second  derivatives  of  the  objective  function 
as  for  example  in  Newton’s  method.  This  is  also  the  approach  taken  here. 

On  the  other  hand  the  straightforward  use  of  Newton's  method  is  inappropriate 
for  our  problem  primarily  because  of  large  dimensionality.  We  have  thus 
introduced  various  approximations  to  Newton's  method  which  exploit  the 
network  structure  of  the  problem  and  facilitate  distributed  computation. 

In  Section  2  we  describe  a  broad  class  of  algorithms  for  minimum 
delay  routing.  This  class  is  patterned  after  a  gradient  projection  method 
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for  nonlinear  programming  [2], [3]  as  explained  in  [4],  and  contains  as  a 
special  case  Gallager's  original  algorithm  except  for  a  variation  in  the 
definition  of  a  blocked  node  [compare  with  equation  (15)  of  [1]].  This 
variation  is  essential  in  order  to  avoid  unnecessary  complications  in  the 
statement  and  operation  of  our  algorithms  and  despite,  its  seemingly  minor 
significance,  it  has  necessitated  a  major  divergence  in  the  proof  of  con¬ 
vergence  from  the  corresponding  proof  of L  - 1  * 

Section  3  describes  in  more  detail  a  particular  algorithm  from  the 
class  of  Section  2.  This  algorithm  employs  second  derivatives  in  a 
manner  which  approximates  a  constrained  version  of  Newton ''s  method  [3] 
and  is  well  suited  for  distributed  computation. 

The  algorithm  of  Section  3  seems  to  work  well  for  most  quasistatic 
routing  problems  likely  to  appear  in  practice  as  extensive  computational 
experience  has  shown  [5],  However  there  are  situations  where  the  unity 
stepsize  employed  by  this  algorithm  may  be  inappropriate.  In  Section  4 
we  present  another  distributed  algorithm  which  automatically  corrects  this 
potential  difficulty  whenever  it  arises  at  the  expense  of  additional  com¬ 
putation  per  iteration.  This  algorithm  also  employs  second  derivatives, 
and  is  based  on  minimizing  at  each  iteration  a  suitable  upper  bound  to  a 
quadratic  approximation  of  the  objective  function. 

Proofs  of  convergence  have  been  relegated  to  Appendices. 

Both  algorithms  of  Sections  3  and  4  have  been  tested  extensively  and 
computational  results  have  been  documented  in  [5]  and  [6] .  These 
results  substantiate  the  a?"  .ions  made  here  regarding  the  practical 

properties  of  the  algorithms.  There  are  also  other  related  second 


derivative  algorithms  [7], [8]  that  operate  in  the  space  of  path  flows 
and  exhibit  similar  behavior  as  the  ones  of  this  paper.  These  algorithms 
are  well  suited  for  centralized  computation  and  virtual  circuit  networks 

but,  in  contrast  with  the  ones  of  the  present  paper,  require  global  information 
at  each  node  regarding  the  network  topology  and  the  total  flow  on  each  link. 


We  finally  mention  that  while  we  have  restricted  attention  to  the 
problem  of  routing,  the  algorithms  of  this  paper  can  be  applied  to  other 
problems  of  interest  in  communication  networks.  For  example  problems  of 
optimal  adaptive  flow  control  or  combined  routing  and  flow  control  have 
been  formulated  in  [9], [10]  as  nonlinear  multicommodity  flow  problems  of 
the  type  considered  here,  and  the  algorithms  of  this  paper  are  suitable 
for  their  solution. 
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2.  A  Class  of  Routing  Algorithms 

Consider  a  network  consisting  of  N  nodes  denoted  by  1,2,...,N  and  L 
directed  links.  The  set  of  links  is  denoted  by  L  .  We  denote  by  (i,£) 
the  link  from  node  i  to  node  £  and  assume  that  the  network  is  connected 
in  the  sense  that  for  any  two  nodes  m,n  there  is  a  directed  path  from  m  to 
n.  The  flow  on  each  link  (i,£)  for  any  destination  j  is  denoted  by  f.^(j). 
The  total  flow  on  each  link  (i,£)  is  denoted  by  F  i.e. 

Fu  ■  X 

1=1 

The  vector  of  all  flows  f^fj),  (i,£)eL,  j  =  is  denoted  by  f. 

We  are  interested  in  numerical  solution  of  the  following  multicommodity 
network  flow  problem: 

minimize  £  (MFP) 

subject  to  l  f-p(j)  -  l  f.U)  *  r  (j), 

£eO(i)  mel(i)  m  1 

Vi  =  1,...,N,  i  ^  j 

>  0,  V(i,£)eL  ,  i  =  1, •  •  •  *N,  j  =  1,...,N 

fj£^5  =  °’  V(j,i)eL  ,  j  »  1,...,N, 

where,  for  i  i  j,  r - ( j )  is  a  known  traffic  input  at  node  i  destined  for  j, 
and  0(i)  and  I(i)  are  the  sets  of  nodes  i  for  which  (i,£)eL  and  (£,i)eL 
respectively. 

The  standing  assumptions  throughout  the  paper  are: 
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a)  r.fj)  >  0,  i,j  =  i  t  j 

b)  Each  function  is  defined  on  an  interval  [D,^)  where  C .  is  either  a 

positive  number  (the  link  capacity)  or  ■*»;  is  twice  continuously 

differentiable  on  (0,C.  9).  The  first  and  second  derivatives  of  D.. 

i*-  is, 

at  zero  are  defined  by  taking  the  limit  from  the  right.  Furthermore 
is  convex,  continuous,  and  has  strictly  positive  first  and 
second  derivatives  on  [0,C. J. 

c)  (MFP)  has  at  least,  one  feasible  solution-  f  satisfying  F  < 
for  all  (i,£)eL. 


For  notational  convenience  in  describing  various  algorithms  we  will 
suppress  in  what  follows  the  destination  index  and  concentrate  on  a  single 
destination  chosen  for  concreteness  to  be  node  N.  Our  definitions,  opti¬ 
mality  conditions,  and  algorithms  are  essentially  identical  for  each 
destination,  so  this  notational  simplification  should  not  become  a  source 
of  confusion.  In  the  case  where  there  are  multiple  destinations  it  is 
possible  to  implement  our  algorithns  in  at  least  two  different  ways. 

Either  iterate  simultaneously  for  all  destinations  (the  "all-at-once11 
version) ,  or  iterate  sequentially  one  destination  at  a  time  in  a  cyclic 
manner  with  intermediate  readjustment  of  link  flows  (the  t>one-at-a-time,, 
version).  The  remainder  of  our  notation  follows  in  large  measure  the  one 
employed  in  [1].  In  addition  all  vectors  will  be  considered  to  be  column 
vectors,  transposition  will  be  denoted  by  a  superscript  T,  and  the  standard 
Euclidean  norm  of  a  vector  will  be  denoted  by  j»|  ,  i.e.  x  x  =  |x|  for 
any  vector  x.  Vector  inequalities  ere  meant  to  be  componentwise,  i.e.  for 
x  =  (Xj,...,xn)  we  write  x  _>  0  if  x.^  >_  0  for  all  i  =  l,...,n. 

Let  be  the  total  incoming  traffic  at  node  i 


t. 

i 


=  r.  ♦  )  f 

1  mel(i)  01 
mj*N 


i  =  1, 


,N-1, 


U) 
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Furthermore  the  functions  t(4>,r),  f($,r)  are  twice  continuously  dif¬ 
ferentiable  in  the  relative  interior  of  their  domain  of  definition 
$x{r|r  >  Oh  The  derivatives  at  the  relative  boundary  can  also  be  defined 
by  taking  the  limit  through  the  relative  interior.  Furthermore  for  every 
r  >  0  and  every  f  which  is  feasible  for  (MFP)  there  exists  a  such 
that  f  =  f($,r). 

It  follows  from  the  above  discussion  that  the  problem  can  be  written 
in  terms  of  the  variables  as 

minimize  D C4>»r)  =  ^  D.  ?  [f.  „  (<J,r)]  (3) 

Ci>*)eL 

subject  to 

where  we  write  D($,r)  =  <°  if  £^(4>,r)  >_  for  some  (i,A)el. 

Similarly  as  in  [1],  our  algorithms  generate  sequences  of  loopfree 
routing  variables  4>  and  this  allows  efficient  computation  of  various 
derivatives  of  D.  Thus  for  a  given 

$e<I>  w  »  sa>  that  node  k  is  downstream  from  node  i  if  there  is  a  directed 

path  from  i  to  k,  and  for  every  link  (£,m)  on  the  path  we  have  <p„  >0. 

m 

We  say  that  node  i  is  upstream  from  node  k  if  k  is  downstream  from  i.  We 
say  that  $  is  loopfree  if  there  is  no  pair  of  nodes  i,k  such  that  i  is 
both  upstream  and  downstream  from  k.  For  any  and  r  >  0  for  which 
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;$,r)<  «>  the  partial  derivatives  can  be  computed  using  the  follow¬ 


ing  equations  [1] 


■Jjp—  =  t.  (D! .  +  -sr--),  (i>£)£i>  i  -  l,-*-»N-l 

3<Pi  o  1  1*.  dTj 


J  srJ-  1 

£eff(x)  ^ 


where  D!  ^  denotes  the  first  derivative  of  with  respect  to  f.  The 

r\n  on 

equations  above  uniquely  determine  — r- — *  and  -rj—  and  their  computation 

is  particularly  simple  if  <*>  is  loopfree.  In  a  distributed  setting  each 

node  i  computes  and  via  (4) ,  (S)  after  receiving  the  value  of 

_  d*iZ  ^i  . 

from  all  its  immediate  downstream  neighbors.  Because  (j>  is  loopfree 

3r£ 

the  confutation  can  be  organized  in  a  deadlock-free  manner  starting  from 
the  destination  node  N  and  proceeding  upstream  [1] . 

A  necessary  condition  for  optimality  is  given  by  (see  [1]) 


3D 

=  min 
mEO(i) 

3D 

3<5>. 

vim 

3D 

>  min 
msO (i) 

3D 

30. 

vim 

if  4>u  >  0 


if  4>-o  =  °» 


where  all  derivatives  are  evaluated  at  the  optimum.  In  view  of  (4, ,  this 


condition  can  be  written  for  t.  >  0 


D!  0  +  =  min  CD!  +  if  <J>.  0 

l£  *£  meO(i)  1111  *Y 


D !.+■=-  >  min  (D{  +  if  fc  . 

l£  *1  “  meO(i)  im  l£ 


Combining  these  relations  with  (5)  we  have  that  if  t^ 


3D  .  - 

—  =  mm  o. 

oi*  •  ~  f  •  n  im 

i  meO(x) 


where 


<5.  =  D!  + 

lra  im  3r 


V  meO  (i  ) 


m 


In  fact  if  (6)  holds  for  all  i  (whether  t.  =  0  or  t^ 
ficient  to  guarantee  optimality  (see  [1],  Theorem  3). 
We  consider  the  class  of  algorithms 


k+l  k  k 

<*>*  =  <j£  +  ,  i  =  1, . • . ,N  -1 


k  1 

where,  for  each  i,  the  vector  AcJk  with  components  A<->, 

solution  of  the  problem 

minimize  oTa4>-  +  —  A<{>Tm^A<{>- 
i  Ti  2a  x  l  i 


subject  to  <p.  +  A $■.  >  0,  T  &<p.B  =  0, 
x  -  z  ix. 

&<Piz  *  o,  v  £eB(i;$k). 


The  scalar  a  is  a  positive  parameter.  The  vector  6. 
[cf.  (7)] 

6i£  =  DU  +  Hr  ’ 


>  0 

=  0. 

^  0  then 

(6) 

(7) 

>  0)  then  it  is  suf- 

(8) 

■lS,  £c0(i)  is  any 

(9) 

has  components 


V  £sO(i) 
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k  k  T  t 

where  all  derivatives  are  evaluated  at  d>  and  f(§  ,r).  and  5^  (0r  £$!)  denotes 

transpose  of  6^  (or  ) .  For  each  i  for  which  t^($^,r)  >  0,  the  matrix 

is  some  symmetric  matrix  which  is  positive  definite  on  the  subspace 

^  I  1  v .  =  0  },  i .  e . 

£e0(i)  1 

vTm^v.  >  0,  v  v.  ?  0,  T  x,.  =  n 

ill  x  L  'it 

£s0(i)  * 

This  condition  guarantees  that  the  solution  to  problem  (9)  exists  and  is 

unique.  For  nodes  i  for  which  t.  (i^.r)  =  0  the  definition  of  is  immaterial. 

k 

The  set  of  indices  B(i;p  )  is  specified  in  the  following  definition: 


Definition:  For  any  4>e$>  and  i=l,..,i&-l  the  set  B(i;$),  referred  to  as 
the  set  of  blocked  nodes  for  o  atji,  is  the  set  of  all  £eO(i)  such  that 
^i£  =  °*  311(1  either  —  _<  ,  or  there  exists  a  link  (ra,n) 

X  it 

referred  to  as  an  improper  link  such  that  a=£  or  a  is  downstream  of  £ 
and  we  have  $  >0, 

tw  Sr  —  Sr 


m 


3r 


n 


k  k+1 

It  is  shown  below  that  if  is  loopfree,  then  tp  generated  by  the 

algorithm  is  also  loopfree.  Thus  the  algorithm  generates  a  sequence  of  loop- 

free  routings  if  the  starting  q>°  is  loopfree.  We  refer  to  [1]  for  a 

description  of  the  method  for  generating  the  sets  B(i;<&  )  in  a  manner 

V 

suitable  for  distributed  computation.  Gur  definition  of  B(i;<ji  )  differs 
from  the  one  of  [1]  primarily  in  that  a  special  device  that  facilitated  the 
proof  of  convergence  given  in  [1]  is  not  employed  (compare  with  equ.  (IS)  of  [1]). 


We  new  demonstrate  some  of  the  properties  of  the  algorithm  in  the 
following  proposition. 

k  j 

Proposition  1:  a)  If  g  is  loopfree  then  is  loopfree. 

k  k  k 

b)  If  p'  is  loopfree  and  a§‘  =  0  solves  problem  (9)  then  <*>  is  optimal. 


c)  If  <p  is  optimal  then  <p  is  also  optimal. 

k  k 

d)  If  ^  0  for  some  i  for  which  t^(<J>  ,r)  >  0  then  there  exists  a 
positive  scalar  such  that 

D(4>k  +  nA4>k,r)  <  D(<j>k,r),  vneCO.r^]. 


k+ 1 

Proof:  a)  Assume  that  4>  is  not  loopfree  and  there  exists  a  sequence 

k+1 

of  links  forming  a  directed  loop  a!'-  "ich  <)>  is  positive.  Then 

lc 

there  must  exist  a  link  (m,n)  cn  the  loop  for  which 

m  n 

k  lc 

From  the  definition  of  B(m;<f>  )  we  must  have  <{>  >  0  and  hence  (m,n)  is  an 

improper  link.  Now  move  backwards  around  the  loop  to  the  first  link 

k  k 

(i,S.)  for  which  =  0.  Such  a  link  must  exist  since  <f>  is  loopfree. 

k 

Since  l.  is  upstream  of  m  and  (ra,n)  is  improper,  we  have  2£B(i;<j>  )  which 

k+1 

contradicts  the  hypothesis  <}>££  >  0. 

k  T 

b)  If  A<t>  =  0  solves  problem  (9)  then  we  must  have  d.A^  >  0  for  each 

i  and  A^  satisfying  the  constraints  of  (9) 


A4>i  >  I  d<j)i£  =  0,  A<f»ijl  =  0, 


V teB(i;4>k) . 


By  writing  A<Jk  =  and  using  (5),  (7)  we  have 


=  y  6  4.  .  >  0. 

L  jJTU  8r.  - 
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By  considering  ^  =  1  individually  for  each  Ci ; <J>K) ,  we  obtain 


§7  1  «u  »  V£  *  B(i;4,k). 


From  (5)  and  (7)  then 
3D 


3r.  =  5U  ’  BdsA.with  4  >  0. 


Since  D!  >  0  for  all  (i,£,)eL  it  follows  from  (5) ,  (7)  and  the  relation  above  that 

1  /C 

there  are  no  improper  links,  and  using  the  definition  of  B(i;<b  )  we 
obtain 


min  6 
£e0(i) 


which  is  a  sufficient  condition  for  optimality  of  $  [cf.  (6)]. 

c)  If  <j>  is  optimal  then  from  the  necessary  condition  for  optimality  (6)  we 

have  that  for  all  i  with  t^  >  0 


min 

riEO(i) 


6. 

lm 


It  follows  using  a  reverse  argument  to  the  one  in  b)  above  that 
A<{>k  =  0  if  t.  >  0. 

l  l 


Since  changing  only  routing  variables  of  nodes  i  for  which  t^  =  0  does 

R  k*l 

not  affect  the  flow  through  each  link  we  have  D((f>  ,r)  »  D(<J>  ,r)  and 

,k+l  .  . 

<p  is  optimal. 

k  k 

d)  Since  is  positive  semidefinite  for  all  i  with  t^  >  0  and  A$i  is  a 

solution  of  problem  (9)  we  have 

6rA4>k  <  0 

i.  l  - 
k 

If  t.  >  0  then  is  positive  definite  on  the  appropriate  subspace  and 
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_  j, 

ae(0,a]  and  k=0,l,...  the  sequence  [<j>  }  generated  by  algorithm  (8) ,  (9) 
satisfies 

k+l  k  k 

D(<J>  ,r)  _<  D C<J>  > r)  ,  lim  D(iJ  ,r)  =  min  D(4>, r). 

k-*»  4>e$ 


^  1c 

Furthermore  every  limi.t  point  of  {$  }  is  an  optimal  solution  of  problem 


(3). 


Another  interesting  result  which  will  not  be  given  here  but  can  be 
found  in  [11]  states  that,  after  a  finite  number  of  iterations,  improper 
links  do  not  appear  further  in  the  algorithm  so  that  for  rate  of  con¬ 
vergence  analysis  purposes  the  potential  presence  of  improper  links 
can  be  ignored.  Based  on  this  fact  it  can  be  shown  under  a  mild 
assumption  that  for  the  single  destination  case  the  rate  of  convergence 
of  the  algorithm  is  linear  [11]. 

The  class  of  algorithms  (8) , (9)  is  quite  broad  since  different 
choices  of  matrices  m!^  yield  different  algorithms.  A  specific  choice 

of  yields  Gallager's  algorithm  [1]  [except  for  the  difference  in 

k 

the  definition  of  B(i;$  )  mentioned  earlier].  This  choice  is 

the  one  for  which  :i.s  diagonal  with  all  elements  along  the  diagonal 
being  unity  except  the  (£.,£) th  element  which  is  zero  where  it  is  a  node 
for  which 


min  5 
JteO  (i) 


We  leave  the  verification  of  this  fact  to  the  reader.  In  the  next 
section  we  describe  a  specific  algorithm  involving  a  choice  of  based 
on  second  derivatives  of  D^.  The  convergence  result  of  Proposition  2 
is  applicable  to  this  algorithm. 
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3.  An  Algorithm  Based  on  Second  Derivatives 


A  drawback  of  the  algorithm  of  [1]  is  that  a  proper  range  of  the 
stepsize  parameter  a  is  hard  to  determine.  In  order  for  the  algorithm 
to  have  guaranteed  convergence  for  a  broad  range  of  inputs  r,  one  must 
take  a  quite  small  but  this.vdll  lead  to  a  poor  speed  of  convergence  for 
most  of  these  inputs.  It  appears  that  in  this  respect  a  better  choice 
of  the  matrices  can  be  based  on  second  derivatives.  This  tends  to 
make  the  algorithm  to  a  large  extent  scale  free,  and  for  most  problems 
likely  to  appear  in  practice,  a  choice  of  the  stepsize  a  near  unity 
results  in  both  convergence  and  reasonably  good  speed  of  convergence  for 
a  broad  range  of  inputs  r.  This  is  supported  by  extensive  computational 
experience  some  of  which  is  reported  in  [5]  and  [6]. 

We  use  the  notation 

0„  . 

11  t«u)2 

We  have  already  assumed  that  is  positive  in  the  set  [0,0^).  We  rould  1,'ke 

2  k 

to  choose  the  matrices  to  be  diagonal  with  t~2  — — 1 along  the 

1  1  Wu] 

diagonal.  This  corresponds  to  an  approximation  of  a  constrained  version 
of  Newton's  method  (see  [3]),  where  the  off-diagonal  terms  of  the  Hessian 
matrix  of  D  are  set  to  zero.  This  type  of  approximated  version  of 
Newton's  method  is  often  employed  in  solving  large  scale  unconstrained 
optimization  problems.  Unfortunately  the  second  derivatives  - =■ 

are  difficult  to  compute.  However,  it  is  possible  to  compute  easily 
upper  and  lower  boupds  to  them  which,  as  shewn  by  computational  ex- 


1 


periments,  are  sufficiently  accurate  for  practical  purposes. 
Calculation  of  Upper  and  Lower  Bounds  to  Second  Derivatives 


We  compute  - -  evaluated  at  a  loopfree  $£$,  for  all  links 

t3*ur 

(i  ,Z)eL  fo  -.j.ch  ££B(i;4>).  We  have  using  (4) 


{tilD“  *  '' 


Since  2./£B(i ; tf»)  and  <J>  is  loopfree,  the  node  2.  is  not  upstream  of  i.  It 
3t.  3D.' 

follows  that  v.  1  =  0  and  -rr-—  *  D'.'  t. .  Using  again  the  fact  that  2-  is 
3$,  9  d<t>.giZi 

3t.lx  3D!. 

not  upstream  of  i  we  have  *  0,  — *  0  and  it  follows  that 

if  if 

•lS _  .  JL  lt  a-  .  SL ,!  „  t  a2|)  • 

*U8rt  **  I  i(  u  **  ’l  1  [sr,]2 

Thus  we  finally  obtain 


[3+uI2  [3r,]2 


A  little  thought  shows  that  the  second  derivative  - =—  is  given  by 

r  rv _  1  * 


the  more  general  formula 


[*»]' 


3Vrm 


I  q..  (£)q..  (m)D»  V  l  ,m=l, . . .  ,N-1 
(j,k)el  Jk  3k 


where  q.,  (2.)  is  the  portion  of  a  unit  of  flow  originating  at  l  which 

32D 

goes  through  link  (j,k).  However  calculation  of - using  this 

Ur,)2 
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formula  is  complicated,  and  in  fact  there  seems  to  be  no  easy  way  to 
compute  this  second  derivative.  However  upper  and  lower  bounds  to  it 
can  be  easily  computed  as  we  now  show.  By  using  (5)  we  obtain 


77  ■  ^ 

[Sr^J  Am  m 


Since  $  is  loopfree  we  have  that  if  $  >  0  then  a  is  not  upstream  of 

3tA  3DJfc 

A  and  therefore  —~r  -  1  and  — —  =  D'*  A  similar  reasoning  shows 

or£  dr£  A®  *>“ 

that 


3  D  _  3  i-r  .  ,n,  .  3D  <.1  r  ,  3  D 

3rff3r  3r  ^An^  An  3r  ^  ^An  3r  3r 

Am  mn  n  n  m  n 


Combining  the  above  relations  we  obtain 


[3T,] 


2  '  ^  ^Am  DAm  +  ^  l  *Am  ^An  l/sr  * 


2  2 

since  Its it  i  °» by  settin«  ww 

m  n  m  n 


to  zero  for  mj*n  we  obtain  the 


lower  bound 


By  applying  the  Cauchy-Schwartz  inequality  in  conjunction  with  (12)  we 
also  obtain 


3r  3r 
m  n 


<  ,/S I 

~  1  rar  1 


[ar,]2  [Sr/ 


Using  this  fact  in  (13)  we  obtain  the  upper  bound 


l  * 


u 


2m 


%  + 


£d _ .2 

2  * 


It  is  now  easy  to  see  that  we  have  for  all  k 

R.  <  <  R 

“  Ur/  ~  1 


where  and  are  generated  by 


R„  =  7  C  (D*»  +  R  ) 
-2,  L  £m  -nr 

m 


R*  *  l  4n  °£nj  +  <L  h*  W2 
m  m 


(14) 


(15) 

(16) 


The  computation  is  carried  out  by  passing  R  and  R  upstream  together  with 

***X 

SD 

~ —  and  this  is  well  suited  for  a  distributed  algorithm.  Upper  and  lower 
1  -  32D 

bounds  $.n,4>.  for  - - =-  ,  2^B(i;$)  are  obtained  simultaneoulsy  by 

_1  U  [8*i / 

means  of  the  equation  [cf.  (11)] 


$ 

— xZ- 

-  *  4> 

(17) 

¥ 

& 

■  *>»  *  V- 

(18) 

It  is  to  be  noted  that  in  some  situations  occuring  frequently  in  practice 
the  upper  and  lower  bounds  u  and  §  ^  coincide  and  are  equal  to  the  true 


second  derivative.  This  will  occur  if  =  0  for  m?*n.  Fox 

m  n 

example  if  the  routing  pattern  is  as  shown  in  Figure  1  (only  links  that 


Figure  1 

A  typical  case  where  ’?i l  *  ~il  and  the  discrepancy  affects  materially 
the  algorithm  to  be  presented  is  when  flow  originating  at  i  splits  and 
joins  again  twice  on  its  way  to  N  as  shown  in  Figure  2. 


Figure  2 


The  Algorithm 

The  following  algorithm  seems  to  be  a  reasonable  choice.  If  t^  /  0 

we. take  in  (91  to  be  the  diagonal  matrix  with  £eQ(j)  along  the 

diagona!  where  the  upper  bound  computed  from^'18)  and  (14) -(16)  and 

a  is  a  positive  scalar  chosen  experimentally.  (Inmost  cases  a=l  is  satisfactory.) 

Convergence  of  this  algorithm  can  be  easily  established  by  verifying  that 
the  assumption  of  Proposition  2  is  satisfied.  A  variation  of  the  method 
results  if  we  use  in  place  of  the  upper  bound  <{>. .  the  average  of  the 

A  +  A 

— i£ 

upper  and  lower  bounds  - ^ - .  This  however  requires  additional 

computation  and  communication  between  modes. 

Problem  (9)  can  be  written  for  t^  j*  0  as 


minimize 


I  tsu  *  Sr  «»u>2> 

£  l 


subject  to  A<j>u  >  A4>iA  =  0,  A<J>U  =  0  V£eB(i;<f>k) 

£ 


and  can  be  solved  using  a  Lagrange  multiplier  technique.  By  introducing 
the  expression  (18)  for  and  carrying  out  the  straightforward  calculation 
we  can  vrrite  the  corresponding  iteration  (8)  as 
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♦u1  =  max{°*4- 


a(6i£-u) 


VDU+V 


-} 


(20) 


where  y  is  a  Lagrange  multiplier  determined  from  the  condition 


.  a(6..-p) 

.  max{0,C. - — 

zmu*  )  ti(Di£+V 


l 


}  =  1. 


(21) 


The  equation  above  is  piecewise  linear  in  the  single  variable  y  and  is 
nearly  trivial  computationally.  Note  from  (20)  that  a  plays  the  role 
of  a  stepsize  parameter. 

It  can  be  seen  that  (20)  is  such  that  all  routing  variables  d>. 

XX 

such  that  *ll<»  will  be  increased  or  stay  fixed  at  unity,  while  all 
routing  variables  such  that  >  y  will  be  decreased  or  stay  fixed 
at  zero.  In  particular  the  routing  variable  with  smallest  6.£  will  either 
be  increased  or  stay  fixed  at  unity,  similarly  as  in  Gallager's  algorithm. 
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4.  An  Algorithm  Based  on  an  Upper  Bound  to  Newton's  Method 

While  the  introduction  of  a  diagonal  scaling  based  on  second 
derivatives  alleviates  substantially  the  problem  of  stepsize  selection, 
it  is  still  possible  that  in  some  iterations  a  unity  stepsize  will  not 
lead  to  a  reduction  of  the  objective  function  and  may  even  cause  divergence 
of  the  algorithm  of  the  previous  section.  This  can  be  corrected  by  using 
a  smaller  stepsize  as  shown  in  Proposition  2  but  the  proper  range  of 
stepsize  magnitude  depends  on  the  network  topology  and  may  not  be  easy  to 
deteirraine.  This  dependence  stems  from  the  replacement  of  the  Hessian 
matrix  of  D  by  a  diagonal  approximation  which  in  turn  facilitates  the 
computation  of  upper  bounds  to  second  derivatives  in  a  distributed  manner. 
Neglecting  the  off-diagonal  terms  of  the  Hessian  means  that  while  operat¬ 
ing  the  algorithm  for  one  destination  we  ignore  changes  which  are  caused 
by  other  destinations.  The  potential  difficulties  resulting  from  this 
can  be  alleviated  (and  for  most  practical  problems  eliminated)  by  operat¬ 
ing  the  algorithm  in  a  "one-at-a-time"  version  as  discussed  in  Section  2. 
However  the  effect  of  neglecting  the  off-diagonal  terms  can  still  be 
detrimental  in  some  situations  such  as  the  one  depicted  by  Figure  3.  Here 

r,  =  r-  =  r„  =  r.  >  0,  rc  =  r,  =  0  and  node  7 

1  2  j  4  5  6 
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is  the  only  destination.  If  the  algorithm  of  the  previous  section  is 
applied  to  this  example  with  a =1,  then  it  can  be  verified  that  each  of 
the  nodes  1,2,3  and  4  will  adjust  its  routing  variables  according  to 
what  would  be  Newton’s  method  if  all  other  variables  remained  unchanged. 

If  we  assume  symmetric  initial  conditions  and  that  the  first  and  second 
derivatives  DI7>  Djj7  and  D^7,  Dj.'7  are  much  larger  than  the  correspond¬ 

ing  derivatives  of  all  other  links,  then  the  algorithm  would  lead  to  a 
change  of  flow  about  four  times  larger  than  appropriate.  Thus  for 
example  a  value  of  o  =  1/4  is  appropriate,  while  ct=l  can  lead  to 
divergence . 

The  algorithm  proposed  in  this  section  bypasses  these  difficulties 
at  the  expense  of  additional  computation,  per  iteration.  We  show  that  if 
the  initial  flow  vector  is  near  optimal  then  the  algorithm  is  guaranteed  to 
reduce  the  value  of  the  objective  function  at  each  iteration  and  to  con¬ 
verge  to  the  optimum  with  a  unity  stepsize.  The  algorithm  "upper  bounds" 
a  quadratic  approximation  to  the  objective  function  D.  This  is  done  by 
first  making  a  trial  change  in  the  routing  variables  using  algorithm 
(8) , (9) .  The  link  flows  that  would  result  from  this  change  are  then,  calcu¬ 
lated  going  from  the  "most  upstream"  nodes  downstream  towards  the  destination. 
Based  on  the  calculated  flows  the  algorithm  "senses"  situations  like  the  one  in 
Figure  3  and  automatically  "reduces"  the  stepsize.  We  describe  the  algo¬ 
rithm  for  the  case  of  a  single  destination  (node  N) .  The  algorithm  for 
the  case  of  more  than  one  destination  consists  of  sequences  of  single 
destination  iterations  whereby  all  destinations  are  taken  up  cyclically 
(i.e.  the  one-at-a-time  mode  of  operation). 
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Tne  Algorithm 

At  the  typical  iteration  of  the  algorithm  we  have  a  vector  of  loop- 
free  routing  variables  $  and  a  corresponding  flow  vector  f.  We  first 
carry  out  iteration  (8),  (9)  with  the  choice  of  described  in  Section 
3  and  a  unity  stepsize,  and  obtain  a  trial  increment  of  routing  variables 
denoted  by  A4>*.  Based  on  A<j>*  we  calculate  the  new  (and  final]  increment  of 
routing  variables  AifT  and  the  new  routing  vector 

?  =  <J>  +  A?  (22) 

by  means  of  a  procedure  of  tho  following  type.  Each  node  i  computes  the 
corresponding  vector  of  routing  variable  increments  A?L  by  solving  a 
problem  of  the  form 

minimize  Q.  (A§.)  (23) 

subject  to  the  constraints 


A*u  >  0  if  ^ 

>  0 

(24a) 

A$i£  i  0  ifA$i£ 

<  c 

(24b) 

**i£  =  ° 

=  0 

(24c) 

t> 

O 

M' 

*>*» 

II 

O 

(24d) 

i  ° 

(24e) 

where  Q^(A<^)  is  a  quadratic  function  of  A$j  which  depends  on  o  and 
and  will  be  defined  shortly.  Notice  that  the  constraint  £24)  guarantees 
that  the  new  vector  of  routing  variables  §  is  loopfree.  In  what  follows 
we  describe  the  procedure  and  rationale  for  obtaining  the  fora  of  the 
quadratic  function  Q.  of  (23),  and  show  that  all  computations  can  be 


0 
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carried  out  in  a  distributed  manner. 

Let  Af  denote  an  increment  of  flow  such  that  f  +  Af  is  feasible. 

A  constrained  version  of  Newton's  method  [3]  is  obtained  if  Af  is  chosen 
to  minimize  the  quadratic  objective  function 


N(Af) 


D!  Af. 
U  U 


i,£ 


DU(A*W' 


(25) 


subject  to  f  +  AfeF  where  F  is  the  set  of  all  feasible  flow  vectors. 
Let  A<{>  be  the  change  in  <j>  that  corresponds  to  Af.  We  write 


=  <f>  +  A<J)  . 

Finally  let  t  and  At  be  the  vectors  of  total  incoming  traffic  at  the 
network  nodes  and  corresponding  changes  [cf.  (1)].  Then  we  have 


**1  ■  pfu>  '26> 

A£«  ■  *  ‘t»U-  (27) 


Substi citing  (27)  in  (25)  we  obtain 


N(A£)  . 


(28) 


*  7  £  *  aVu.V*u  *  W") 


\ 
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°U-  DU*Di'llV*U 
“i  *  l  ■°k  =  0 

By  multiplying  (30)  by  At  ,  summing  over  i  and  using  (27)  we  obtain 


(29) 

(30) 


ift 


1  = 

’  l  “i4ti  '  |  AVi  *  ,1  h**i  *°£ 

•  I  t.  Ad).  _D' 

iU  1  11 1 

By  using  this  relation  together  with  (29)  we  can  write  (28)  as 

nc«  =  |  q  V4,u(Di([  *  d.) 


*??  Du[(4tiW2*  "i%P2l> 


(31) 


Now  if  (At^)  were  available  then  we  could  conceive  of  a  recursive  scheme 
whereby  node  i  would  obtain  the  vector  A<Jk  which  minimizes  the  correspond¬ 
ing  term  in  the  right  hand  side  of  (31)  after  receiving  the  value  of 
from  its  downstream  neighbors  and  in  fact  it  can  be  seen  that  such  a 
computation  can  be  carried  out  in  distributed  fashion  starting  from  the 

destination  and  proceeding  upstream  similarly  as  for  algorithm  (8),  (9). 

2 

Unfortunately  (At.)  depends  on  the  values  of  A4>  for  nodes  m  that  lie 
l  m 

upstream  of  i.  To  bypass  this  difficulty  we  develop  in  what  follows  an 

_  'J 

upper  bound  for  the  troublesome  term  V  D'.’  (At . <j> . . )  by  making  use  of 
the  increment  A$*  obtained  through  an  iteration  of  algorithm  (8) ,  (9) .  When 


this  upperbound  is  substituted  in  (31)  we  will  obtain  an  upper  bound  to 


-28-  * 


N(Af)  of  the  form 

N(Af)  <  l  t.Q.^.) 
1 


where  Q.  (a<J>.)  is  precisely  the  expression  to  be  used  in  the  algorithm 
[cf.  (23)]. 


Let  us  define  for  all  i  =  1,...,N-1,  (i,£)e(-  and  A<{>  satisfying  the 

constraint  (24) 

A<|>|£ 

=  max(0}A<J>?^)  ,  Ad>*£  =  |min(0,A<{)*^)  j 

(32) 

=  max(0,A4>i£)  ,  A<j>"^  =  |min(0,A<J>u)  I 

(33) 

(34) 

At?" 

l 

•  |  IV+li  *  At!’wti 

(35) 

At! 

l 

* 

(36) 

At: 

i 

(37) 

The  quantities  At?+,  At?"  are  well  defined  by  virtue  of  the  fact  that  the 
set  of  links 


L*  =  ( (i,£)ei|  >  0,  or  <j>i£  +  A^  >  0} 

forms  an  acyclic  network  [in  view  of  the  manner  that  the  sets  of  blocked 
nodes  B  ( 4>;i)  are  defined  in  algorithm  (8),  (9)].  As  a  result  At?+  and 
At?"  are  zero  for  all  nodes  i  which  are  the  -'most  upstream"  in  this 
acyclic  network .  Starting  from  these  nodes  and  proceeding  downstream 
the  computation  of  At?+  and  At?  can  be  carried  out  in  a  distributed 
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where  from  each  summation  above  we  exclude  all  nodes  £  for  which  the  corresponding 
denominator  [and  hence  also  the  numerator  by  (24)  and  Letmna  1]  is  zero. 


Similarly  we  obtain 


where  the  summation  in  (41)  [(42)]  is  over  all  nodes  £  such  that  f  0 
(fiij>?~  i  0).  Define  also 

D||+  =  0  ,  DJI"  =  0.  (43) 


Notice  that  given  A^.^  and  Dy+  for  all  downstream  neighbors  £  ic  is  possible 

*J*  a 

for  node  i  to  compute  DV  and  DV  .  Thus  for  each  A<j>  satisfying  (24)  the 
quantities  DV  ,  DV~  are  well  defined  and  can  be  computed  recursively  start¬ 
ing  from  the  destination  N  and  proceeding  upstream  in  a  distributed  manner. 
The  following  proposition  yields  the  desired  upper  bound. 

Proposition  5:  Under  the  constraint  (24)  we  have 

N(Af)  <  l  t.Qi(A<j>i)  (44) 

i 

where 


=  I  [Wit  *  Di)»M  *  lev?*  *  6U>  WU)21 


(45) 


and 


SU 


0 
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i£  >  0 

if  At|>*£  <  0 


if  t*it  -  0 


(46) 


Proof :  In  view  of  (31)  it  will  suffice  to  show  that 


£  °u“Vu>2  i 


(47) 


From  (38)  we  have 


X  Du'4Vu>2  ■  I(“i>2  ?  D’u(*it,: 


(48) 


i,i 


<  I  (Att)2  l  DJt(*u)2  ♦  I  (At")2  J  DV^U)2. 

l)o  l  Ki 


For  all  i  with  Att  >  0  we  have,  using  Lemma  1,  At|+  >  0  so  by  dividing  (41) 
by  At?  we  obtain 


l  Uti>2 \  Vhi>2  *  l  t4tl> 


■T  .2 


+»  2 


DV 

l 


I 


°X 


.  +  t  .+ 
At*  i  At? 


*U  + 


,  +  ,2 
^A<f>-  0) 


A4»: 


u 


(At*)2  D'.,+ 

J  —7— 1  -  J  Df 

i  At?  i,A  x 


toti)2*u  .  (Kj2»*u>2  1 

_+  .  +  .  _+ 


At? 


At£-  A*?, 


By  using  (39)  we  obtain 
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Ic<)2  S  J 


(Att)2DV+ 

Atf 


I 

£ 


(*1>V ,  i  <*Mi!2 


i  D’iV^2 

i,l  A(J)|^ 


i,£ 


(49) 


Similarly  we  obtain 
2 


I  (At“/  J  D"  (♦-  J2  <  I 
i  1  £  1X  13t  i,  £ 


“ft 


(50) 


By  combining  (48) -(50)  and  using  the  constraint  (24)  we  obtain  the  desired 
relation  (47) .  Q.E.D. 

The  algorithm  can  now  be  completely  defined.  After  the  routing  in¬ 
crement  A4>*  is  calculated  in  a  distributed  manner  by  means  of  algorithm 
(8) ,(9),  each  node  i  computes  the  quantities  Att+  and  At?”.  This  is  done 
recursively  and  in  a  distributed  manner  by  means  of  equations  (34) ,  (35) 
starting  from  the  "most  upstream"  nodes  and  proceeding  downstream  towards 
the  destination.  When  this  downstream  propagation  of  information  reaches 
the  destination  indicating  that  all  nodes  have  completed  the  computation 
of  At?+  and  At?  ,  the  destination  gives  the  signal  for  initiation  of  the 
second  phase  of  the  iteration  which  consists  of  computation  of  the  actual 
routing  increments  AcjL .  To  do  this  each  node  i  must  receive  the  values 
of  Dj^,  D£+,  and  D£”  fro£  its  downstream  neighbors  £  and  then  determine 
the  increments  A^.^  which  minimize  Q^(A^)  subject  to  the  constraint  (24) 
and  the  new  routing  variables 

\ £  =  *i£  +  A*i£* 

Then  node  i  proceeds  to  compute  D!,  DV+,  and  DV"  via  (30),  (41),  and  (42) 


and  broadcasts  these  values  to  all  upstream  neighbors.  Thus  proceeding 
recursively  upstream  from  the  destination  each  ncde  computes  the  actual 


routing  increments  A<^  in  ouch  the  saae  way  as  the  trial  routing  increments 
A<*>?  were  computed  earlier. 

It  is  shown  in  Appendix  B  that  if  the  starting  flow  vector  f°  is 
sufficiently  close  to  being  optimal  then  the  algorithm  just  described 
reduces  the  value  of  the  objective  function  at  eacn  iteration  and  con¬ 
verges  to  the  optimal  value.  We  cannot  expect  to  be  able  to  guarantee 
theoretical  convergence  when  the  starting  routing  variables  are  far  from 
optimal  since  this  is  not  a  generic  property  of  Newton's  method  which  the 
algorithm  attempts  to  approximate.  However  in  a  large  number  of  com¬ 
putational  experiments  with  objective  functions  typically  arising  in  com¬ 
munication  networks  and  starting  flow  vectors  which  were  far  from  optimal 
[5]  we  have  never  observed  divergence  or  an  increase  of  the  value  of  the 
objective  function  in  a  single  iteration.  In  any  case  it  is  possible  to 
prove  a  global  convergence  result  for  the  version  of  the  algorithm  whereby 
the  expression  Q-fAffr.)  is  replaced  by 


Q“(A$J  =  l 

xx  £ 


[CD! 


i& 


DP 


♦  (tiDU 


)2] 


(51) 


where  a  is  a  sufficiently  small  positive  scalar  stepsize.  In  Appendix  B 
we  show  that  by  choosing  a  sufficiently  small  it  is  possible  to  guarantee 
a  reduction  of  the  objective  function  at  each  iteration  for  any  starting 
point  4>° s$.  This  fact  can  be  used  to  prove  a  convergence  result  similar 
to  the  one  of  Proposition  2. 


Appendix  A:  Proof  of  Proposition  2 

The  proof  of  Proposition  2  to  be  given  in  this  appendix  applies  to 
the  "all-at-once"  version  of  algorithm  (8), (9),  i.e.  the  one  where  at  each 
iteration  k  every  node  i  solves  problem  (9)  for  all  destinations  j  and 
adjusts  the  corresponding  routing  variables  according  to  (8).  A  nearly 
identical  proof  applies  to  the  "one-at-a-time"  version  (see  Gafni  [11]). 

The  destination  of  flows,  routing  variables,  etc.  will  be  denoted  within 
parentheses.  Thus  for  example  (^(j)  denotes  the  routing  variable  of  linl: 
(i,£)  for  destination  j. 

The  following  lemma  bears  close  similarity  in  both  statement  and 
proof  as  Lemma  5  of  (iallager  [1] .  The  proof  vili  be  omittlr  ,  but  may  be 
found  in  [11]. 

Lemma  A.l:  Let  the  assumptions  of  Proposition  2  he) a.  There  exists  a 
scalar  ae(0,l]  (depending  on  Dq,  X,  and  A)  such  ’•.hat,  for  every  ae(0,a], 
the  corresponding  sequence  {p  }  generated  by  algorithm  (8) , (9)  satisfies 

D(*k+\r)-DM>k,r)  <  -p[  [tk(j)]2U$k(j)  |2,  k=0,l,...  (A.l) 

lim  tk(j) |A0k(j)|  =  0,  Vi,j*l,2,...,N,  i  j*  j  (A.2) 

k-+«  1  1 

lim  l^(j)  -  fja(j)|  -  0,  V(i ,mL,  i,j^,2,...,N,  i  *  j  (A.3) 

v 

where  p  is  some  positive  scalar  (depending  on  a,  0q,  X,  A),  t^fj)  denotes 
the  total  traffic  arriving  at  node  i  which  is  destined  for  j  when  the 
routing  is  4>k,  A<{>k(j)  =  ^+1(j)-4>k(j)»  and  fj£(j),  are  the  flows 
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!c  K+ 1 

on  link  (i,£)  destined  for  j  and  corresponding  to  <f>  ,  <j>  respectively. 

The  following  learn?  provides  a  key  fact. 

_  _ 

Lemma  A. 2:  If  ae(0,a]  where  a  is  as  in  Leisna  A.l  and  {d>  }  is  a  correspond¬ 
ing  sequence  generated  by  algorithm  (8) , (9)  there  holds 

lia  [A^(j)  -  A*(j)]  =  0  ,  Vi.j  =  1.....N,  i  jt  j,  (A.4) 

k-*®  1  1 


where  for  all  i,j,k 

A^(j)  =  max{^(j) \i  e  0(i),  $L*X(j)  >  0} 


ri£ 


(A.  5) 


A^(j)  =  min  {<££(j)  |  £  e  0(i),  UB(i,<!,k)(j)} 

jL 


-  ♦  zuteL. . 

l£  l£  3r.(j) 


(A.6) 
(A.  7) 


Proof:  From  a  necessary  condition  for  optimality  for  problem  (9)  we  obtain 

tlC(j) 

[6iCj)  +  "IT  -  4>i+1Cj)]  >0  (A.8) 


for  all  <|>i (j)  which  are  feasible  in  problem  (9). 


Let  £  and  £  be  such 


that 


6i£(j)  =  AiCj)'  6i£Cj)  =  4Cj)* 


If  £  j-  _£  ws  define  $?(j)  to  be  the  vector  with  components 


♦i£«> 


,k+l,..  _ 
*i£  Cj)-£ 

if  £  =  £ 

♦uOJ46 

if  £  *  £ 

♦u  0) 

otherwise 
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k+1  — k 

where  e>0  is  small  enough  so  that  (j)-e>0.  By  definition  of  A?(j) 

k+1 

such  an  e  exists  and  by  feasibility  of  (j)  we  have  that  is  also 

feasible.  Substituting  $?(j)  in  (A. 8)  in  place  of  we  obtain 

ef^O*)  -  1  f  [W^(j)  -  liJjO)] 

where  P*£(j)  and  416  the  h.  311(1  ^  elements  of  the  vector  t^(j)M^(j)A<jn  (j) . 

Using  the  assumption  that  all  elements  of  M^(j)  are  bounded  above  by  A 


we  obtain 

0  <  £(j)  -  A*(j)  <  £  t*(j)  l  |  A*J£(j)|. 

This  relation  holds  also  if  £  =  £  since  then  AJ(j)  =  A^(j).  From  (A. 2) 
we  see  that  the  right  hand  side  tends  to  zero.  Equation  [A. 4)  follows.  Q.E.D. 

Given  any  set  of  routing  variables  <J>  e  4>  there  is  a  unique  correspond¬ 
ing  set  of  flows  f^fj).  If  we  view  the  first  derivative  Dl^Cf.^)  as  the 
length  of  link  (i,£)  then  the  corresponding  shortest  distance  from  any 
node  i  to  any  other  node  j  is  well  defined  and  will  be  denoted  by  S^.  (<J>) . 

It  is  easily  seen  using  equation  (6)  that  a  sufficient 

A 

condition  for  optimality  of  a  set  of  routing  variables  <J>  e  <J  is 


S  (A)  = 


V  .  ,j  =  1,.. .,N,  i  **  j . 


(A.  9) 


Furthermore  there  holds 


S-  - ($) 

xj 


<  3D(4>,r) 

-  “SrTCjj 


V  <}>  e  $  and  i,j  =  1,...,N,  i  f  j. 


(A.  10) 


We  have  the  following  lensaa: 

_  —  v 

Lemma  A. 3:  If  ae(0,a]  where  a  is  as  in  Lemma  A.l,  }  is  a  correspond¬ 
ing  sequence  of  algorithm  (8) ,  (9) ,  m  >  1  is  an  integer,  and  ft  is  an  infinite 
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k  k-m  ^ 

index  set  such  that  the  subsequences  {<{>  and  (4>  }  ^  converge  to  <f> 

and  <{>  respectively  then 


fiSU)($.r)  =  fii(j)(<J>,r),  v(i ,i)eL,  j  =  1.....N 


(A. 11) 


sij(«  =  Sj[ .  (<>)  ,  V  i,  j  -  1,2,...,  N 


(A. 12) 


Proof :  Equation  (A. 11)  follows  from  (A. 3),  and  equation  (A. 12)  follows 

from  the  fact  that  S..  (<M  depends  on  $  only  through  the  flows  f^0(j)  (4>,r) . 

Q.E.D. 

We  will  use  "two  dimensional  induction"  to  show  that  the  limit  of 

|r 

any  convergent  subsequence  of  {$  )  satisfies  the  sufficient  condition 
for  optimality  (A. 9).  Lemma  A. 4  that  follows  represents  the  basic  step 
of  the  induction  proof.  We  use  repeatedly  the  fact  that  if  some  property 
1  holds  for  all  k  with  k  >  k^  and  some  property  2  holds  for  all  k  with  k  >  k 
then  both  hold  for  all  k  with  k  >  max  (k^,  kp.  In  what  follows  we  will 
express  this  by  writing  "if  1  holds  for  all  k  large  enough  and  2  holds 
for  all  k  large  enough,  then  both  hold  for  all  k  large  enough". 

Lemma  A. 4;  Let  ae(Q,a]  where  a  is  as  in  Leiana  A.l,  let  {<j>  }  be  a 

corresponding  sequence  generated  by  algorithm  (8) , (9)  and  let 

k- 1  ~  k  A  k 

<J»and  {$  be  two  convergent  subsequences  of  {(j)  }.  For 

each  j  let  S.  C4>)  be  the  set  of  distances  {S..(<}>)|  i  e  N}.  Let 

S^(j)»*..»S  (j),  p  <  N  be  the  distinct  elements  of  the  set  S^.  C4>)  and 

assume  without  loss  of  generality  that  0  =  S^(j)  <  S2U)  <...<  S^Cj). 

Denote 

Iq(j)  =  tiJS..  C4>)  <  q  = 


(A.  13) 


Assume  that  for  some  integer  q  we  have: 

A 

3D(*»r)  -  3D(^.r)  _  e  rli  v  ;  t  ^  t  m 

a)  Sr.(jT  '  3r.(j)  ijt<W  Vl  qUJ’  3 

-  k-1 

b)  For  all  k  large  enough,  k  e  K,  and  for  any  j,  if  $  (j)  >0  and 

3D($k_1,r)  .  3D»k_1,r) 

-  E  *,&>  then  -  I^ST  >  3rni3  V"  • 

Then: 


fjfjf  =  S..M)  Viclqtl0),  j-l . N 

~  k 

b')  For  all  k  large  enough,  kefC,  and  for  any  j,  if  <|>-  (j)  >  0  said 

n£I  m  then  MIlLd  >  «Wk.r) 

“  £  q*lU)  then  3rn0)  3rn(j) 


Proof:  Let  i  be  such  that  i  e  Iq+1Cj) 


and  denote 


A.U)  =  U|S..  (4>)  =  D!£[fuC4>,T)]  +  S£.C«f£e  0(i)} 


Bv  the  definition  of  shortest  distance  we  have 


SijW)  <  DkI£i.iC<l>,r)3  +  S£j(<*°  VZ  1  V3)*  £e0(i)* 

Using  (A. 10)  and  the  above  equation 

V«  ‘  Didfu»'rJ  *  ffof 

or  equivalently 

SijC^  <fiiiC5)W»r)  v  £  <  ^(3).  ^O(i). 


By  the  assumption  Dj^  >  0  ^d  the  fact  i  e  I^+j(j),  we  have 


Therefore  by  using  hypothesis  a)  we  have 


=  Di£[fi£Ci»r)]  +  =  D»#[f.#(i,r)]  ♦  S#.(J)  (A.16) 


=  S. .  ($) 
13  r 


3rA(j)  i£l  i£^'J 

V  £££^(j),  3  —  • 


Since  0(i)  is  a  finite  set  there  exists  e  >  0  sfch  that 


5iw(j)(<*>,r)  -  e  >  6i£(j)C<j»,r)  v  w  £  ^(j),  weG(i),£  e  ZjCj),  j  = 

k-1 

Since  6^(j)(4>,r)  is  continuous  in  tj>  and  {$  p  is  a  convergent 
sequence,  we  get  that  for  all  k  large  enough,  k  e  K 

6-  CHOl^r)  >  S.0($k~\r)  +  f  V  w  s'  £  (j)  ,  weOCi);  A  e  j  =  1, 

Also  ,  1  <  i,  j  <  N,  is  continuous  in  <$>  and  therefore  by  Lemma 

2,  (A.  16)  and  hypothesis  a),  for  all  k  large  enough,  k  e  K 


3^(3) 


3li(3i 


V  £  £  £^(j)»  3  ~ 


which  together  with  hypothesis  b)  and  the  definition  of  5($;i)(j)  implies 
that  for  all  k  large  enough,  k  e  K 

AjCj)  n  B(4.k_1;i)(j)  =  0,  3  =  1.....N  .  (A.  18) 

Lei/ma  A. 2  combined  with  (A.  17)  and  (A.  18)  implies  that  for  all  k  large 
ev.ough  k  e  K 


=  ° 


V  w  £  £• £j) ,  j  =  1>  —  jN 


(A. 19) 


and  taking  the  limit 


{ft,  fi) 


w  t  ^0), 


j  =  1,...,N 


(A.2C 


Using  (A, 20),  Lemma  A. 3  and  hypothesis  a)  we  have 


3D(»,r) 

3r.(j) 


I 

ZeZ 


0)  [Dhtfikw.«!  *  f^jf] 


i  i 


ZeL(j) 


u<»  [Diktfik*r»  ‘  ] 


=  S. .  c<w  =  S..0M 
13  13 


This  together  with  part  a)  of  the  hypothesis  establishes  a')- 


To  see  b'},  notice  that  by  continuity  of  in  <j>  and  the 

^iUJ 


preceding  equation  we  have  that  for  all  k  la?  nough,  k  e  K 


3D(»k,r)  >  3D(^k,r) 

3^(3)  3^(j) 


V  J-  £  (3) »  j  =  1 ,  •  • .  jN 


(A. 21 


Equations  (A.21)  and  (A. IS)  hold  for  all  i  e  and  b')  follows. 


Q.E.I 


By  now  we  have  developed  all  the  machinery  for  the  convergence  proof 


of  Proposition  2.  We  will  simply  make  repeated  application  of  Lc-iisna  A. 4 
for  the  proper  sequences. 


Proof  of  Proposition  2:  Take  a  to  be  as  in  Lemma  A.l,  let  a£(0,ct]  and 
v 

let  {<£  }  be  a  corresponding  sequence  generated  by  algorithm  (8) , (9) . 
k 

The  sequence  {<J>  }  belongs  to  a  compact  set  and  therefore  there  exists 

k  k-1 

a  convergent  subsequence  {<j>  -»•  <J>.  The  sequence  {<j>  has  a 

k-1  k-2 

convergent  subsequence  {<f>  •*’  <f>^>  C  K.  The  sequence  {(+> 

1  ^2  ^ 

has  a  convergent  subsequence  {<J> '  ‘*■^2*  ^1*  Proceeding 

this  way  v,a  get  a  convergent  subsequence 

t<f>  ^keKN_1'k  Vl'  KN-1C  KN-2  * 

We  have  K..  C  K  C  . .  .C  K  and 
N-l  N-2 


♦n-1  t*k’1>kel£N_1  +  {<J>k}keKN. 


By  Lemma  A. 3  the  shortest  distances  which  correspond  to 
<j>  ,  <J>  <p  are  the  same.  As  a  result,  in  what  follows,  when  we 

r»“X  ri—Z 

mention  the  set  I^(j)  we  need  not  specify  the  limit  point  to  which 
it  corresponds. 

Let  K  in  Lemma  A. 4  be  K ^  For  each  destination  j,  the  only  element 

in  I^(j)  is  j  and  therefore  the  assumptions  of  Lemma  A. 4  hold  for 
I|(j)  and  the  pairs  of  sequences 

'Aer  ^'XcP-  « **~Xsh  ^'X^'---^Xeh  ^XzP- 


Applying  Lemma  A. 4  for  q  =  1,  we  obtain  that  its  hypothesis  holds  for 
q  =  2  and  the  pairs  of  sequences  C[  {4>^}  ^  r  ^  ]  , . .  , 

.  Proceeding  this  way  we  note  that  the 
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hypothesis  of  Lemma  A.  4  holds  for  q  =  N-l  and  the  pair 
({<l>k}ke£*  {4»k-1}keK  )•  Applying  Lemma  A. 4  again  we  see  that  the 
conclusion  of  its  part  af)  holds  for  q  =  N-l,  i.e.,  equation  (A. IS) 
holds  for  IN(j),  j  =  1,...,N.  Since  every  node  in  the  network  belongs 
to  j  =  1»...»N,  it  follows  that  (A. 9)  is  satisfied,  and  hence 

<*>  is  optimal . 


Q.E.D. 


In  this  appendix  we  analyze  the  descent  properties  of  the  algorithm 
of  Section  4.  We  assume  a  single  destination  but  the  proof  extends  trivial¬ 


ly  to  the  case  where  we  have  multiple  destinations  and  the  algorithm  is 
operated  in  the  one  destination  at  a  time  mode.  In  view  of  the  fact  that 
each  function  is  strictly  convex  it  follows  that  there  is  a  unique 
optimal  set  of  total  link  flows  {f?^| (i,£)eL}.  It  is  clear  that  given 
any  e>0  there  exists  a  scalar  y£  such  that  for  all  feasible  total  link 
flow  vectors  f  satisfying 

lfU  "  £h)  -  Ye  »  Vtt.ttei  (B.  1) 

we  have 


1 

1+e 


i  Di*‘£u2 


V(i  ,DeL. 


(B.*) 


The  strict  positivity  assumption  v-  also  implies  that  for  each 

y  >0  there  exists  a  scalar  J(y  5  ^uch  that  every  feasible  f  satisfy- 

ing  1  D.»(f.„)  £  6(y  )  ai.se  satisfies  (B.l)  and  hence  also  (B.2).  Further- 

more  <5(y  )  can  be  taken  arbitrarily  large  provided  y  is  sufficiently 
£  £ 

large.  We  will  make  use  of  this  fact  in  the  proof  of  the  subsequent 
result. 


Proposition  B.l:  Let  (j>  and  <j>  be  two  successive  vectors  of  routing 

variables  generated  by  the  algorithm  of  Section  4  (with  stepsize  a=l) 

and  let  f  and  f  be  the  corresponding  vectors  of  link  flows.  Assume 

2 

that  for  some  e  with  0  <  e  <  -  lwe  have 


A 


(B.S) 


where  y  is  the  scalar  corresponding  to  e  as  in  (B.l),  (B.2),  and  6(y  ) 
is  such  that  (B.l)  [and  hence  also  (B.2)]  holds  for  all  feasible  f 
satisfying  (B.S).  Then 
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D(4»,r)  -  D(<j>,r)  <  -  p(e)  l  '  C°i ^  jj," (B.4) 

(i  jTjcL 

2 

where  p(s)  =  ,£ ~  >  0  for  all  e  with  0  <  e  <^E  -  1. 

Proof:  Let  Af  be  the  increment  of  flow  corresponding  to  the  increment 
A4>  =  <p-d ).  We  have 

D«,r)  -  DM, r)  =  i[^i£l>UCfutn“u)|Ilrt)  *  I  ^  <4fH)2[,Ht£U*nMH) 

for  some  n*e(0,l].  Denoting  D^C^+^Af.^)  =  P"  and  using  an  argument 
similar  to  the  one  employed  in  Section  4  [cf.  (28)- (31)]  we  obtain 

DW,r)-D(*,r)  =  .1  V*U<DU*5P  \l  'DVDU>4ti»uV*H 

‘I  *  CV*H>2].  CB.SJ 


We  will  derive  upper  bounds  for  each  of  the  three  terms  in  the  right 
side  of  (B.5). 

From  the  necessary  condition  for  A<Jk  to  minimize  the  function 
Qi(A«*>i)  of  (45)  subject  to  the  constraint  (24)  we  obtain 


l  t°U+  Di  ♦  WuWit)"!,  i  0 


or 


l  'DU*5P4*U  i  -  l  "i DU*«H)t4*U)2- 


(B.6) 


There  is  no  loss  of  generality  in  replacing  each  function  D 


U 


by 


n=n* 


a  function  which  is  continuously  differentiable,  is  identical  with 
on  the  set  of  flows  satisfying  (B.l)  and  is  quadratic  outside  this  set, 

provided  that,  as  part  of  the  subsequent  proof,  we  show  that 
I  *^i£ (fi£+T1^fi£,)  for  all  ne[0,l].  By  using  this  device  we  can  assume 


that  D'.'n  satisfies  (B.2)  for  all  f .  Hence  from  (B.2) 

li  v  i£ 

D"  -  D'.'  , 

— ~  <  U+e)  -  l 
i£ 


(B.7) 


D.  .  _ 

#  <  Cl+e) 2. 

ui£ 


(3.8) 


Using  (47),  (B.7),  the  Cauchy-Schwart2  inequality  and  the  arithaetic-gc-.mietric 
inequality  we  have 

1 

1  l 

1  [tl+e)2-l]  IJ5^AVU)2]2[T K'^t.^^)2]2 


^  1  ’  l 

i  [(l.e)2-ntl  Sijt^Mj,)2]2!!  DJtCt.Mu)2]2 

1  /.  1 ,  JC 

i  i  [(l*c)2-l][  I  Butt (4«u) 2  *  l  ^(tj^,)2] 

1 »  X  > 

■  |ca*ej2-i]  £  t.  I  CtjD^jCCi^)2. 


(B.9) 


Using  again  (47)  and  (B.8)  we  obtain  for  each  i 


l  SyavitJ2  *  'Vn/j 

i,£ 

<  (l  +e)2  i  I^tfiVij)2  +  Di£(vii£)2] 

1,2,  ^ 

<  (l*e)2  £  *  D^CV^n 

■  ti^J2  ?  I  (t,DJ  .?u)(i*u)2. 

1  l 


(B. 10) 
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By  combining  now  (B.5),  (B.6),  (B.9),  ?md  (B.IO)  we  obtain 


D(*,r)  -  DW.r)  <  [-l+(l+£)2-l  +  ]  I  ta(t4D'' 

~~2  2  (i,£)eL  1  1  1 


DU  +  6i  i'Mii* 


=  Cii£)el 


and  (B.4)  is  proved.  It  is  also  straightforward  to  verify  that 

p(£)  >0  for  e  in  the  interval  (0,J~  -1).  Q.E.D. 


The  preceding  proposition  shows  that  the  algorithm  of  Section  4 
does  not  increase  the  value  of  the  objective  function  once  che  flow 
vector  f  enters  a  region  of  the  form  {f|  £  <  fify^)},  and  that  the  size 

of  this  region  increases  as  the  third  derivative  of  becomes  smaller. 

Indeed  if  each  function  D^„  is  quadratic  then  (B.2)  is  satisfied  for  all 
e>0  and  the  algorithm  will  not  increase  the  value  of  the  objective  for 
all  f. 


The  preceding  analysis  can  be  easily  modified  to  show  that  if  we 

introduce  a  stepsize  a  as  in  (51)  then  the  algorithm  of  Section  4  is  a 

descent  algorithm  at  all  flows  in  the  region  (f|  J  D..(f. 0)  <  S(y  )}  where 

i,i: lJi  “  e 


0  <  £  < 


V 


izr 

2a 


-  1. 


From  this  it  follows  that  given  any  starting  point  $£$,  there  exists 
a  scalar  a>0  such  that  for  all  stepsizes  OECOtcTj  the  algorithm  of 
Section  4  does  not  increase  the  value  of  the  objective  function  at  each 


subsequent  iteration 
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